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Abstract —We consider a compressive hyperspectral imaging recon¬ 
struction problem, where three-dimensional spatio-spectral information 
about a scene is sensed by a coded aperture snapshot spectral imager 
(CASSI). The approximate message passing (AMP) framework is 
utilized to reconstruct hyperspectral images from CASSI measurements, 
and an adaptive Wiener filter is employed as a three-dimensional 
image denoiser within AMP. We call our algorithm “AMP-3D-Wiener.” 
The simulation results show that AMP-3D-Wiener outperforms existing 
widely-used algorithms such as gradient projection for sparse recon¬ 
struction (GPSR) and two-step iterative shrinkage/thresholding (TwIST) 
given the same amount of runtime. Moreover, in contrast to GPSR and 
TwIST, AMP-3D-Wiener need not tune any parameters, which simplifies 
the reconstruction process. 

Index Terms —Approximate message passing, CASSI, compressive 
hyperspectral imaging, Wiener filtering. 

I. Introduction 

Motivation: A hyperspectral image is a three-dimensional (3D) 
image cube comprised of a collection of two-dimensional (2D) 
images (slices), where each 2D image is captured at a specific 
wavelength. Hyperspectral imaging allows us to analyze spectral 
information about each spatial point in a scene, and has applica¬ 
tions to areas such as medical imaging [1], remote sensing [2], 
geology [3], and astronomy [4], 

The imaging processes in conventional spectral imagers [5-7] 
take a long time, because they require scanning a number of 
zones linearly in proportion to the desired spatial and spectral 
resolution. To address the limitations of conventional spectral imag¬ 
ing techniques, many spectral imager sampling schemes based on 
compressive sensing [8-10] have been proposed [11-13]. The coded 
aperture snapshot spectral imager (CASSI) [11,14-16] is a popular 
compressive spectral imager and acquires image data from differ¬ 
ent wavelengths simultaneously, which significantly accelerates the 
imaging process. On the other hand, because the measurements from 
CASSI are highly compressive, reconstructing 3D image cubes from 
CASSI measurements becomes challenging. Moreover, because of 
the massive size of 3D image data, it is desirable to develop fast 
reconstruction algorithms in order to realize real time acquisition 
and processing. 

Related work: Several compressive sensing algorithms have been 
proposed to reconstruct image cubes from measurements acquired 
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by CASSI. One of the efficient algorithms is gradient projection 
for sparse reconstruction (GPSR) [17]. GPSR models hyperspectral 
image cubes as sparse in the Kronecker product of a 2D wavelet 
transform and a ID discrete cosine transform (DCT), and solves 
the fi-minimization problem to enforce sparsity in this transform 
domain. Wagadarikar et al. [14] employed total variation [18] 
as the regularizer in the two-step iterative shrinkage/thresholding 
(TwIST) framework [19], a modified and fast version of standard 
iterative shrinkage/thresholding. Apart from using the wavelet-DCT 
basis or total variation, one can learn a dictionary with which the 
image cubes can be sparsely represented [12,20]. However, these 
algorithms all need manual tuning of some parameters, which may 
be time consuming. 

Contributions: We develop a robust and fast reconstruction algo¬ 
rithm for CASSI using approximate message passing (AMP) [21]. 
AMP is an iterative algorithm that can apply image denoising at 
each iteration. Previously, we proposed a 2D compressive imaging 
reconstruction algorithm, AMP-Wiener [22], where an adaptive 
Wiener filter was applied as the image denoiser within AMP. In 
this paper, AMP-Wiener is extended to 3D hyperspectral images, 
and we call it “AMP-3D-Wiener.” Our numerical results show that 
AMP-3D-Wiener reconstructs 3D image cubes with less runtime 
and higher quality than other reconstruction algorithms such as 
GPSR [17] and TwIST [14,19] (Figure 2), even when the regu¬ 
larization parameters in GPSR and TwIST have already been tuned. 
In fact, the regularization parameters in GPSR and TwIST need to 
be tuned carefully for each image cube, which requires to run GPSR 
and TwIST many times with different parameter values. Moreover, 
the improved reconstruction quality of AMP-3D-Wiener allows to 
reduce the number of shots taken by CASSI by a factor of 2 
(Figure 4). 

The remainder of the paper is arranged as follows. We review 
CASSI in Section II, and describe our AMP based compres¬ 
sive hyperspectral imaging reconstruction algorithm in Section III. 
Numerical results are presented in Section IV, while Section V 
concludes. 

H. Coded Aperture Snapshot Spectral Imager 

The coded aperture snapshot spectral imager (CASSI) [16] is 
a compressive spectral imaging system that collects far fewer 
measurements than traditional spectrometers. In CASSI, (i) the 2D 
spatial information of a scene is coded by an aperture, (ii) the coded 
spatial projections are spectrally shifted by a dispersive element, and 
(hi) the coded and shifted projections are detected by a 2D focal 
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Figure 1: The matrix H is presented for M = N = 8, L = 4:, and 
K — 2. The circled diagonal patterns that repeat horizontally corre¬ 
spond to the coded aperture pattern used in the first FPA shot. The 
second coded aperture pattern determines the next set of diagonals. 
Each FPA shot captures M{N + L + 1) = 104 measurements. 

plane array (FPA). For a 3D image cube of dimension M x N x L, 
the imaging process of CASSI can be written in a matrix-vector 
form, 

g = Hfo-|-z, (1) 

where fo G R" is the vectorized 3D image cube of dimension 
n = MNL, and vectors g G and z G R™ are the measurements 
and the additive noise, respectively. The matrix H G models 

the linear relationship between fo and g, and accounts for the effects 
of the coded aperture and the dispersive element. Recently, Arguello 
et al. [23] proposed a higher order model to characterize the CASSI 
system with greater precision. In this higher order CASSI model, 
each cubic voxel is shifted to an oblique voxel because of the 
continuous nature of the dispersion, and therefore the oblique voxel 
contributes to multiple measurements in the FPA. A sketch of the 
matrix H in the higher order CASSI model is depicted in Figure 1, 
where the image cube size is M = N — 8, and L — 4. The 
matrix H consists of a set of 3 diagonal patterns, accounting for 
the voxel energy impinging into 3 neighboring FPA pixels. The 3 
diagonal patterns repeat in the horizontal direction, each time with 
a unit downward shift, as many times as the number of spectral 
bands. Each diagonal pattern is the coded aperture itself after being 
column-wise vectorized. lust below, the next set of diagonal patterns 
is determined by the coded aperture pattern used in the subsequent 
shot. With K shots of CASSI, the number of measurements is 
m = KM{N + L -f 1) (see [23] for details). 

III. Proposed Algorithm 

The goal of our proposed algorithm is to reconstruct the image 
cube fo from its compressive measurements g, where the matrix H 
is known. In this section, we describe our algorithm in detail. The 
algorithm employs (i) approximate message passing (AMP) [21], 
an iterative algorithm for compressive sensing problems, and (ii) 
adaptive Wiener filtering, a hyperspectral image denoiser that can 
be applied within AMP. 

Scalar channels: Below we describe that the linear imaging sys¬ 
tem model in (1) can be converted to 3D image denoising in scalar 
channels. Therefore, we begin by defining scalar channels, where 


the noisy observations q of the image cube fo obey q = fo -F v, 
and V is the additive noise vector. Recovering fo from q is known 
as a 3D image denoising problem. 

Approximate message passing: AMP [21] has recently become 
a popular algorithm for solving signal reconstruction problems in 
linear systems as defined in (1). The AMP algorithm proceeds 
iteratively according to 

f‘+'=77t(H^r* + f*), (2) 

r* = g - Hf* + + f^-^)), (3) 

where is the transpose of H, R = m/n represents the 
measurement rate, rit{-) is a denoising function at the f-th iteration, 
r]'t{s) = §^Vt{s), and (u) = ^ m; for some vector u = 

{u\,U 2 ,... ,u„). The last term in (3) is called the “Onsager reaction 
term” [21,24] in statistical physics. In the t-th iteration, we obtain 
the vectors f‘ and r*. We highlight that the vector -t- f‘ 

in (2) can be regarded as a noise-corrupted version of fo in the t- 
th iteration with noise variance af, and therefore rjti') is a 3D 
image denoising function that is performed on a scalar channel 
q* = H^r* -f f* = fo + V*, where the noise level at is estimated 
by [25], 



i=l 


and rl denotes the i-th component of the vector r* in (3). 

Adaptive Wiener filter: We are now ready to describe our 3D 
image denoiser, which is the function rjti-) in (2). First, we want to 
find a sparsifying transform such that hyperspectral images have 
only a few large coefficients in this transform domain, because 
based on the sparsifying coefficients, some shrinkage function can 
be applied in order to suppress noise [26]. Inspired by Arguello 
and Arce [27], we apply a wavelet transform to each of the 2D 
images in a 3D cube, and then apply a DCT along the spectral 
dimension. That is, the sparsifying transform ^ can be expressed 
as a Kronecker product of a DCT transform $ and a 2D wavelet 
transform W, i.e., $ (g) W, and it can easily be shown that 

is an orthonormal transform. Our 3D image denoising is processed 
on the sparsifying coefficients = 4'q*. 

Our proposed 3D image denoiser is a modification of the adaptive 
Wiener filter in our previous work [22], which is inspired by Mihfak 
et al. [28]. Let ^ denote the i-th element of 0q. The coefficients Of 
of the estimated (denoised) image cube f* are obtained by Wiener 
filtering, 
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(4) 


where p,i^t and Pf j are the empirical mean and variance of Mq ^ 
within an appropriate wavelet subband, respectively. Note that 
in our previous work [22], the variance was estimated locally 
from the coefficients within a 5 x 5 window, and all coefficients 
had different variance values. In the current work, the variance 
is estimated from an entire subband, and the coefficients within 
each subband share the same variance. Such a denoiser has a 
simpler structure, and is likely to help prevent AMP from diverging. 
Taking the maximum between 0 and {uft — ^t) ensures that if 
the empirical variance Ut t of the noisy coefficients is smaller than 
the noise variance irf, then the corresponding noisy coefficients 







are set to 0. After obtaining the denoised coefficients Of from 
= 4'q*, the estimated image cube in the {t + l)-th iteration 
satisfies f‘+i = r?t(q*) = 

AMP-3D-Wiener: It has been discussed [22] that when the 
sparsifying transform is orthonormal, the derivative calculated in 
the transform domain is equivalent to the derivative in the image 
domain. According to (4), the derivative of the Wiener filter in the 
transform domain with respect to ^ is max{0,Pf_t — ^t} 
Because the sparsifying transform ^ is orthonormal, the Onsager 
term (3) can be calculated as 




1^ ■sp max{0,i2f^t - 


Basic AMP has been proved to converge with i.i.d. Gaussian 
matrices and scalar functions ? 7 t(-) [29], i.e., ^ only depends on 

its corresponding noisy coefficient 0^^ ^. Other AMP variants [30- 
32] have been proposed in order to encourage convergence for a 
broader class of measurement matrices. The matrix H defined in (1) 
is not i.i.d. Gaussian, but highly structured as shown in Figure 1, 
and the adaptive Wiener filter in (4) is not a scalar function owing 
to and Ff ( being functions of multiple noisy coefficients i. 
Unfortunately, AMP-3D-Wiener encounters divergence issues with 
this matrix H. We choose to apply “damping” [31,33], which 
resembles a technique used in Gaussian belief propagation [34], 
to solve for the divergence problems of AMP-3D-Wiener, because 
it is simple and only increases the runtime modestly. Specifically, 
damping is an extra step within AMP iterations that updates the 
values of r* and by weighted sums as shown in Lines 2 and 7 
of Algorithm 1. We will show in Section IV that AMP-3D-Wiener 
converges with an appropriate amount of damping, and AMP-3D- 
Wiener serves as a demonstration that non-scalar denoisers have 
promises in the AMP framework [22,35]. 


Algorithm 1 AMP-3D-Wiener 


Inputs: g, H, 0 < a < 1, maxiter 
Outputs: Pamp 

Initialization: = 0, r° = 0 

for t — 1 : maxiter do 

1) r‘=g-Hf‘ + ir‘-iiEr=i 

2) r* = a • r* -P (1 — a) • r*“^ 


lifO.i 




4) = 

5) Ol, = ^q* 

6) el, = 

7) f*+^ = a 


iel,i — 


^^Of -P (1 - a) • f‘ 


end for 

pmaxIter+1 
AMP — I 


IV. Numerical Results 

In this section, we compare the reconstruction quality and run¬ 
time of AMP-3D-Wiener, gradient projection for sparse reconstruc¬ 
tion (GPSR) [17], and two-step iterative shrinkage/thresholding 
(TwIST) [14,19]. In all experiments, we use the same coded 
aperture pattern for AMP-3D-Wiener, GPSR, and TwIST. In order 
to quantify the reconstruction quality of each algorithm, the peak 


signal to noise ratio (PSNR) of each 2D slice in reconstructed cubes 
is measured. 

In AMP, the damping parameter a is set to be 0.2. The choice 
of damping mainly depends on the structure of the imaging model 
in (1) hut not on the characteristics of the image cubes, and thus 
the value of the damping parameter a need not be tuned in our 
experiments. 

To reconstruct the image cube fo, GPSR and TwIST minimize 
objective functions of the form f = argminr ||jg — Hf||| -P j3 ■ 
4>{f), where </)(•) is a regularization function that characterizes the 
structure of the image cube fo, and /3 is a regularization parameter 
that balances the weights of the two terms in the objective function. 
In GPSR, = ||4'f||i; in TwIST, is the total variation 
function [18]. We select the optimal values of /3 for GPSR and 
TwIST manually, i.e., we run GPSR and TwIST with 5 — 10 different 
values of /?, and select the results with the highest PSNR. 

In order to compare the performance of AMP-3D-Wiener, GPSR, 
and TwIST, an image cube is experimentally acquired using a wide¬ 
band Xenon lamp as the illumination source, modulated by a visible 
monochromator spanning the spectral range between 448 nm and 
664 nm, and each waveband has 9 nm width. The image intensity 
was captured using a grayscale CCD camera, with pixel size 9.9 
fim, and 8 bits of intensity levels. The resulting test image cube is 
of size M X N = 256 x 256, and L — 24. 

Setting 1: The measurements g are captured with K — 2 shots 
such that the two coded apertures are complementary. Therefore, 
we ensure that the norm of each column in H (1) is similar. The 
measurement rate is m/n = KM{N + L + 1)/{MNL) « 0.09. 
Moreover, we add zero-mean Gaussian noise z to the measurements 
such that the signal to noise ratio (SNR) is 20 dB. The SNR is 
defined as 10 log]^Q(/ig/anoise) [27], where fj,g is the mean value of 
the measurements Hfo and anoise is the standard deviation of z. 

Figure 2 compares the reconstruction quality of AMP-3D-Wiener, 
GPSR, and TwIST within 450 seconds. Runtime is measured on 
a Dell OPTIPLEX 9010 running an Intel(R) CoreTM i7-860 with 
16GB RAM, and the environment is Matlab R2013a. In Figure 2, the 
horizontal axis represents runtime in seconds, and the vertical axis is 
the averaged PSNR over the 24 spectral bands. Although the PSNR 
of AMP-3D-Wiener oscillates during the first few iterations, which 
may be because the matrix H is ill-conditioned, it becomes stable 
after 50 seconds and reaches a higher level compared to the PSNRs 
of GPSR and TwIST at 50 seconds. After 450 seconds, the average 
PSNRs of the cubes reconstructed by AMP-3D-Wiener, GPSR, and 
TwIST are 26.16 dB, 23.46 dB and 25.10 dB, respectively. Figure 3 
displays the reconstructed cubes in the form of 2D RGB images, 
and we can see that AMP-3D-Wiener produces images with better 
quality; images reconstructed by GPSR and TwIST are blurrier. 

Note that in 450 seconds, AMP-3D-Wiener and GPSR run 
roughly 400 iterations, while TwIST runs around 200 iterations. 
Therefore, for the rest of the simulations, we run AMP-3D-Wiener 
and GPSR with 400 iterations, and TwIST with 200 iterations, so 
that all algorithms complete within the similar amount of time. 

Setting 2: In Setting I, the measurements are captured with 
K = 2 shots. We now test our algorithm on the setting where 
the number of shots varies from A' = 2toA' = 12 with pairwise 
complementary coded apertures. Specifically, we randomly generate 
the coded aperture for the fc-th shot for k = 1,3,5,7,9,11, and 
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Figure 2: Runtime versus average PSNR comparison of AMP-3D- 
Wiener, CPSR, and TwIST. Cube size is M = N = 256, and 
L = 24. The measurements are captured with K = 2 shots using 
complementary coded apertures, and the number of measurements is 
m = 143,872. 



Figure 3: Moving from left to right, the panels correspond to the 
groundtruth and image cubes reconstructed by AMP-3D-Wiener, 
GPSR, and TwIST. The 2D RGB images shown in this figure are 
converted from their corresponding 3D image cubes. (The target 
object presented in the experimental results was not endorsed by the 
trademark owners and it is used here as fair use to illustrate the qual¬ 
ity of reconstruction of compressive spectral image measurements. 
LEGO is a trademark of the LEGO Group, which does not sponsor, 
authorize or endorse the images in this paper. The LEGO Group. 
All Rights Reserved, http://aboutus.iego.com/enus/iegal-notice/fair- 
play/.) 

the coded aperture in the {k + l)-th shot is the complement of the 
aperture in the fc-th shot. In this setting, a moderate amount of noise 
(20 dB) is added to the measurements. Figure 4 presents the PSNR 
changes of the reconstructed cubes as the number of shots increases, 
and AMP-3D-Wiener consistently beats GPSR and TwIST. 

Test on natural scenes: We have also tested our algorithm on 
the dataset “natural scenes 2004” [36,37]. In this dataset, there are 
8 image cubes with L — 33 spectral bands and spatial resolution 
of around 1000 x 1000. To satisfy the dyadic constraint of the 2D 
wavelet, we crop their spatial resolution to be M = A’ = 512. The 
measurements are captured with K = 2 shots, and the measurement 
rate is m/n = KM{N + L + 1)/{MNL) « 0.065. We test for 
measurement noise levels such that the SNRs are 15 dB and 20 dB. 
The typical runtimes for AMP with 400 iterations, GPSR with 400 
iterations, and TwIST with 200 iterations are approximately 2, 800 
seconds. We run the algorithms on 5 different complementary coded 
apertures, and the average PSNR for each algorithm is shown in 
Table I. We highlight the highest PSNR among AMP-3D-Wiener, 
GPSR, and TwIST using bold fonts. It can be seen from Table I 
that AMP-3D-Wiener usually outperforms GPSR by 2 — 5 dB in 



Number of shots 

Figure 4: Number of shots versus average PSNR comparison of 
AMP-3D-Wiener, GPSR, and TwIST. Cube size is N ^ M = 256, 
and L = 24. The measurements are captured using pairwise comple¬ 
mentary coded apertures. 

terms of the PSNR, and outperforms TwIST by 0.2 — 4 dB. 



15 dB 

20 dB 


AMP 

GPSR 

TwIST 

AMP 

GPSR 

TwIST 

Scene 1 

30.46 

28.43 

30.14 

30.54 

28.52 

30.27 

Scene 2 

27.33 

24.74 

27.09 

27.74 

24.88 

27.42 

Scene 3 

33.29 

29.53 

31.87 

33.10 

29.57 

31.93 

Scene 4 

32.04 

27.00 

31.61 

32.25 

27.22 

31.97 

Scene 5 

27.44 

24.28 

26.46 

27.80 

24.64 

26.84 

Scene 6 

29.17 

25.02 

25.82 

30.06 

25.55 

26.27 

Scene 7 

36.36 

33.07 

33.76 

37.14 

33.54 

34.21 

Scene 8 

32.15 

28.19 

28.17 

32.99 

28.79 

28.57 


TABLE I: Average PSNRs of AMP-3D-Wiener, GPSR, and TwIST 
for the dataset “natural scene 2004” [36]. The spatial dimensions of 
the cubes are cropped to M = N = 512, and each cube has L — 33 
spectral bands. The measurements are captured with K — 2 shots, 
and the number of measurements is m = 559,104. 


V. Conclusion 

In this paper, we considered compressive hyperspectral imaging 
reconstruction in coded aperture snapshot spectral imager (CASSI) 
systems. Considering that the CASSI system is a great improvement 
in terms of imaging quality and acquisition speed over conventional 
spectral imaging techniques, it is desirable to further improve 
CASSI by accelerating the 3D image cube reconstruction process. 
Our proposed AMP-3D-Wiener used an adaptive Wiener filter as 
a 3D image denoiser within the approximate message passing 
(AMP) [21] framework. AMP-3D-Wiener was faster than existing 
image cube reconstruction algorithms, and also achieved better 
reconstruction quality. 
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